Erroneous input processing method and apparatus in an information processing system using
composite input



A user inputs voice through a voice recognition program (13), a microphone (8)
and an A/D converter (7) while pointing, by a pointing gesture, touch pen or the
like, at an image displayed on a display unit (4). From the result of recognition
of the inputted voice, the processing or display indicated by the candidate having
the first rank of reliability of recognition is performed, and a plurality of
candidates having the second and lower ranks are displayed in menu form on a
display screen (21). In the case where there is an error (that is, in the case
where the processing or display indicated by the candidate having the first rank
is not the processing intended by the user, or the user has made an erroneous
input), the error is corrected in such a manner that the correct input candidate
is selected by a finger, pen or the like from the displayed menu of candidates
having the second and lower ranks, and the processing operation or display
associated with the selected candidate is performed again. Information that would
be redundant or duplicative as compared with the processing of the candidate
having the first rank is held in the system as it is, which saves the user from
having to select from the candidates of recognition one or more times or from
having to make the input again. The result of the processing desired by the user
can be obtained simply by inputting only the correct input candidate again,
thereby providing an interactive system which is natural and easy to use.
BACKGROUND OF THE INVENTION

The present invention relates to a user inter- 
face such as a graphic editing system using voice 
or a voice application system having a display 
screen which system is mounted on office automa- 
tion (OA) equipment such as a personal computer, 
work station, word processor or the like. The form 
of input is not limited to the voice and may include 
an input method in which a directly obtained input 
signal is taken in a system once and a designated 
input is then settled through a recognition process- 
ing. The present invention provides simple means 
for processing an unintended error at the time of 
input in an information processing system which 
has input means in a composite form including a 
voice input as input means. 

The present invention is directed to the pro- 
cessing of an erroneous input in a processing 
which uses voice input information and another 
input information in a composite form. The pro- 
cessing for correction of an erroneous input in such 
a composite input form does not exist in the prior 
art. Accordingly, an analogous technique concern- 
ing an error processing will now be shown using an 
example of the processing for correction of an 
erroneous input in the case where voice input 
means is used for the input of an instruction. 

In the conventional apparatus having a plurality
of input means including a voice input, the
voice input is merely used in lieu of a keyboard
input or the like.

An example of a system using voice input 
means and performing a processing in accordance 
with the reliability of recognition of voice includes a 
ticket vending machine installed at the Shinagawa 
station of JR East Japan Railway Company which 
uses a voice recognition input and a touch panel 
input. The ticket vending machine recognizes an 
inputted voice. In the case where the reliability of 
recognition of a candidate having the first rank of 
reliability as the result of recognition of the inputted 
voice is high, a processing is performed as it is. On 
the other hand, in the case where the reliability of 
recognition of the candidate having the first rank of 
reliability as the result of recognition of the inputted 
voice is low, an actual ticket issuance processing is 
performed after the candidate having the first rank 
of reliability and other candidates have been pre- 
sented to a user so that the correct result of 
recognition from the presented candidates is se- 
lected by the user through a touch panel input or 
after a correct indication has been inputted by the 
user again. 
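
By way of illustration only, the conventional flow described above might be sketched as follows. The function name, the `threshold` parameter and the `confirm` callback are assumptions introduced for this sketch; the actual ticket vending machine is not documented here.

```python
def prior_art_flow(candidates, threshold, confirm):
    """Sketch of the conventional reliability-threshold flow.

    candidates: list of (word, reliability) pairs sorted best first (assumed format).
    threshold:  reliability above which the first-rank candidate is accepted as is.
    confirm:    callback that presents the words to the user and returns the choice.
    """
    best_word, best_reliability = candidates[0]
    if best_reliability >= threshold:
        # High reliability: processed as is, with no later chance of correction.
        return best_word
    # Low reliability: the user must confirm before any processing is performed.
    return confirm([word for word, _ in candidates])
```

In this sketch the user is interrupted whenever the reliability is low, even if the first-rank candidate was in fact correct, and is never consulted when the reliability is high, even if the recognition was wrong.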

The above-mentioned prior art has a problem that the processing of confirmation
by the user must be performed or the input must be made again, and hence a
considerable time is required for the input of information. Further, there is a
problem that even in the case where the reliability of recognition of inputted
voice is low but the recognition agrees with the user's intention, the user is
requested to confirm the result of recognition of inputted voice, that is, the
user is forced to perform a troublesome operation.

Also, in the case where an inputted voice is recognized erroneously with a high
reliability and a processing started on the basis of the inputted voice has been
performed, either it is not possible to correct that processing, or the entire
processing must be cancelled and the succeeding input operation done over again
from the beginning.

SUMMARY OF THE INVENTION 

An object of the present invention is to provide an interface for graphic
edition, image edition or the like which is for a system in which the indication
of an operation is inputted in a composite form using a voice input and another
input method (for example, an input indicated by a touch panel, an input using a
keyboard or an input using a mouse) and which is capable of processing an error
in composite input simply without affecting the other input information.

Another object of the present invention is to provide an interface suitable for
use in graphic edition, image edition or the like which is accompanied by a voice
input as input means using the recognition of voice.

A further object of the present invention is to provide an input method and
apparatus, in an information processing system using a composite input operation
including a voice input and a touch panel input, which is capable of correcting a
processing simply and rapidly even if a processing contrary to the instruction of
an operation to be inputted for the system has been performed due to an operation
based on the erroneous recognition of inputted voice.

To that end, the present invention provides an information processing system
using voice comprising at least information displaying means for displaying
information, position information inputting means for inputting continuous
position information by a user through a pointing gesture or the like, voice
information inputting means for inputting voice information, storing means for
storing the position information and the voice information inputted by the
inputting means, standard pattern information storing means for storing at least
one of voice standard pattern, word information and grammar information, and
voice information analyzing means for determining the reliability of voice
inputted by the inputting means by use of the at least one of voice standard
pattern, word information and grammar information, the system being provided
with error processing means by which in the case where a processing or display
determined by a candidate having the first rank of reliability of recognition as
the result of recognition of voice is first performed while a menu of plural
candidates having the second rank of reliability of recognition and the lower
ranks than that as the result of recognition of voice is displayed on a display
screen and the processing or display determined by the candidate having the
first rank of reliability is erroneous or the user makes an erroneous input, a
correct input candidate is selected by a finger, pen or the like from the
displayed menu of candidates having the second rank of reliability of
recognition and the lower ranks than that so that a processing operation or
display associated with the selected candidate is performed again.

In order that it is not required that information 
other than a voice input and inclusive of pointing 
information be inputted again at the time of correc- 
tion in the case where the candidate of recognition 
of voice is selected, there is provided error pro- 
cessing means for storing pointing information 
which has already been inputted. 

In the case where there is no correct candidate 
in the displayed menu, error processing means for 
enabling correction by inputting only necessary 
information by voice again is provided. 

In the case where a voice input is made again, 
there is provided means by which the candidate 
having the first rank of reliability of recognition as 
the result of recognition of voice and the menu- 
displayed candidates having the second rank of 
reliability of recognition and the lower ranks than 
that are eliminated from an object of recognition. 

There is provided a function of performing the 
processing or display determined by the candidate 
having the first rank of reliability of recognition and 
displaying the content of the result of recognition 
itself on the display screen or outputting the con- 
tent of the result of recognition by voice. 

There is provided an image editing system or the like in which an editing
operation is performed in such a manner that a user inputs a command such as
"ido" (move), "fukusha" (copy) and so forth by voice and indicates an object, a
moving position and so forth by a finger, pen or the like, the system being
provided with error processing means by which in the case where the input of
information is followed by performing an operation based on a command which is a
candidate having the first rank of reliability of recognition of voice while
displaying a menu of candidates having the second rank of reliability of
recognition of voice and the lower ranks than that and the operation based on
the candidate having the first rank of reliability of recognition of voice is
erroneous or the user makes an erroneous input, information other than voice and
inclusive of pointing information having already been inputted is stored so that
a selection to the menu-displayed plural candidates having the second rank of
reliability of recognition and the lower ranks than that is only made by the
finger, pen or the like to perform a processing operation or display associated
with the selection again.

The above and further advantages of the present invention will become apparent
to those skilled in the art upon reading and understanding the following
detailed description of the preferred and alternate embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention will be described in conjunction with certain drawings
which are for the purpose of merely illustrating the preferred and alternate
embodiments of the invention and not for the purpose of limiting the same, and
wherein:

Fig. 1 is a block diagram showing the construction of a system according to the
present invention;

Fig. 2 shows an example of an image displayed on the display screen of a display
unit;

Fig. 3 shows an example of a figure depicting table;

Fig. 4 shows an example of the structure of a voice recognition program;

Fig. 5 shows an example of an image displayed on the display screen of the
display unit;

Fig. 6 shows an example of the data structure of a pointing area table;

Fig. 7 shows an example of the data structure of a word dictionary;

Fig. 8 shows an example of an image displayed on the display screen of the
display unit; and

Fig. 9 is a flow chart showing the outline of a processing operation of the
present invention.

DESCRIPTION OF THE PREFERRED AND ALTERNATE EMBODIMENTS

Embodiments of the present invention will now be described using the
accompanying drawings. The description will be made supposing a graphic editing
system in which an input is made in a composite form. However, the present
invention is not limited to such a system and is generally applicable to a CAD
system, image processing system, information retrieval system and so forth.

Fig. 1 is a block diagram showing an embodiment of the present invention. In
Fig. 1, a system program 11, a graphic editing program 12, a voice recognition
program 13, a pointing area reading program 14, an information combining program
15, voice standard pattern data 16 and a word dictionary 17 provided in a disk
are loaded to a main storage 2 at the time of start-up of the system. Fig. 2
shows an example of an image for graphic edition displayed on a display unit 4
through the graphic editing program 12 loaded to the main storage 2. In Fig. 2,
two circles 22, two triangles 23 and three rectangles 24 are depicted or
displayed in a graphic mode on a display screen 21 by starting the graphic
editing program 12 and referring to a figure depicting table 30 (see Fig. 3)
stored in the main storage 2.

In the present invention, a user points at one of
the displayed objects on the display screen to des-
ignate the one object and performs an editing work
for the designated object. The editing work is in- 
dicated by a voice input. For the edition processing 
in the system, an information processor 1 first 
starts the voice recognition program 13 in the main 
storage 2 and further starts the pointing area read- 
ing program 14. The input of position information is 
possible in such a manner that a pointing operation 
based on the pointing area reading program 14 is 
performed on a touch panel 5 disposed corre- 
sponding or opposite to the display unit 4. The 
details of the pointing area reading program 14 will 
be explained later on. The display unit 4 is con- 
trolled by a display controller 6. 

Fig. 9 is a flow chart showing the outline of an 
example of the operation of the present invention. 
Steps shown in Fig. 9 will now be explained suc- 
cessively. 

As shown in Fig. 4, the voice recognition program 13 includes a voice input
program 131, a characteristic (or feature) extraction program 132, a standard
pattern matching program 133 and a dictionary matching program 134. With the
start of the voice recognition program 13, the voice input program 131 is first
started. A user indicates an editing operation by a voice input using a
microphone 8 while indicating an object, a moving position and so forth on the
touch panel 5 (step 901). Receiving position information inputted from the touch
panel 5 and voice information of editing instruction, the graphic editing system
understands the user's intention from the received information and performs a
graphic edition in accordance with the editing instruction based on the voice
information. The present embodiment will be described in conjunction with the
case where the user successively indicates a point A in the vicinity of a circle
as an object and a point B representative of a copying position while saying
"Kono en o kochira ni copishite" ("Copy this circle to this place" in English
representation) into the microphone 8, as shown in Fig. 5. With the start of the
voice input program 131, the voice inputted from the microphone 8 is converted
by an A/D converter 7 into a digital signal and is thereafter sent to the main
storage 2 for subjection to the succeeding processing (step 902). Next, the
characteristic extraction program 132 is started so that the digital signal
corresponding to the indication by the inputted voice is converted into a time
series of LPC cepstrum coefficients as a characteristic (or feature) vector at a
frame period of 10 ms (step 903). (An example of conversion of characteristic
vector has been disclosed by "ELEMENTS OF VOICE INFORMATION PROCESSING" written
by Saito and Nakata and published by the OHMsha Ltd. in 1981.)

At this time, a buffer memory P provided in the main storage 2 is reset to zero.
Based on the pointing area reading program 14, information of touch coordinates
(X, Y), when the finger tip of the user, a pen or the like touches the touch
panel 5, is taken in through a panel controller 3. The buffer memory P is
incremented each time the coordinate information is taken in. The taken-in
coordinate information is written into a pointing area table of the main storage
2. The pointing area table includes array memories X, Y and T. The x-coordinate
value of the taken-in coordinate information, the y-coordinate value thereof and
the instant of time of input of the coordinate information are written into the
array memories X[P], Y[P] and T[P], respectively. As shown in Fig. 6, the
pointing area table includes a coordinate number 200, the array memory X 201 in
which the x-coordinate value is written, the array memory Y 202 in which the
y-coordinate value is written and the array memory T in which the time instant
of input of the coordinate information is written. Data of the x-coordinate and
y-coordinate of the finger touching the panel and the time instant of input of
the coordinate information are stored into the respective memories starting from
the coordinate number "1" in the order of input (step 904). When a certain fixed
time T0 lapses after the departure of the finger tip, pen or the like from the
touch panel 5, the writing operation is completed. Even in the case where
another means is used, the writing operation is similarly completed upon lapse
of the fixed time.
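
By way of illustration only, the pointing area table and the writing operation of step 904 might be sketched as follows. The class name, the `(x, y, t)` sample format and the approximation of "lapse of the fixed time T0" by the gap between consecutive samples are assumptions of this sketch, not details taken from the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class PointingAreaTable:
    """Array memories X, Y and T; list index 0 holds coordinate number 1."""
    X: list = field(default_factory=list)
    Y: list = field(default_factory=list)
    T: list = field(default_factory=list)

    @property
    def P(self) -> int:
        # Counterpart of buffer memory P: number of coordinates taken in so far.
        return len(self.X)

    def record(self, x, y, t):
        # Write one touch sample into X[P], Y[P] and T[P] in the order of input.
        self.X.append(x)
        self.Y.append(y)
        self.T.append(t)


def read_pointing(samples, t0):
    """Fill a table from touch samples (x, y, t) given in time order.

    Writing is completed once more than t0 has elapsed since the last sample,
    which stands in for the fixed time T0 after the finger leaves the panel.
    """
    table = PointingAreaTable()  # P starts at zero
    last_t = None
    for x, y, t in samples:
        if last_t is not None and t - last_t > t0:
            break
        table.record(x, y, t)
        last_t = t
    return table
```

A real panel controller would deliver samples as events rather than as a prepared list; the list form is used here only to keep the sketch self-contained.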

Upon completion of the operation in which the coordinate information and the
editing instruction inputted through the pointing and utterance by the user are
written into the pointing area table, the standard pattern matching program 133
and the dictionary matching program 134 are started. At the time of start of the
dictionary matching program 134, reference is made to the word dictionary 17. As
shown in Fig. 7, the word dictionary 17 includes a word 191, the content of word
192 and a concept number 193. The concept number 193 is an identification number
for classifying words which have similar meanings.

First, a matching is made between the characteristic vector which is obtained
from the inputted voice and the voice standard pattern data 16 which is stored
in the system beforehand. The matching can be made using, for example, a method
disclosed by Kitahara et al., "EXAMINATION OF COLLOQUIAL SENTENCE ACCEPTING
METHOD IN INFORMATION RETRIEVAL SYSTEM USING VOICE INPUT" (Acoustical Society of
Japan, 3-5-7 (1991)). As the result of matching, the inputted voice is converted
into a string of characters (step 905). An example of the character string is
"kono/en/o/kochira/ni/ido/shite" (corresponding to
"move/this/circle/to/this/place" in English representation). At this time, the
reliability of recognition is determined for each form element (or word
partitioned by the slash "/") to rank the candidates. In addition, the form
elements of the character string obtained by the matching are analyzed using a
known method, for example, a longest matching method disclosed by Aizawa et al.,
"KANA/KANJI TRANSFORMATION USING COMPUTER" (NHK Technical Report, 25, 5 (1973)),
and the matching with the word dictionary 17 is made, thereby obtaining form
element information such as ("kono", demonstrative, 803), ("en", noun, 501),
("o", attached word, 804), ("kochira", noun, 901), ("ni", attached word, 805)
and ("idoshite", verb, 301). The verb is provided with a command number Com[i]
(i = 1 to n) in the order of decreasing ranks of reliability of recognition
(that is, the highest rank comes first) (step 906). In the shown example, "ido"
(move) has Com[1] = 301.

Next, the information combining program 15 is started to make a time
correspondence between the order of input of nouns having the concept number of
500's and nouns having the concept number of 900's and the order of input of a
plurality of pointings, as disclosed by, for example, Japanese Patent
Application No. 04-221234 by Kitahara et al. entitled "SYSTEM FOR INPUT IN
COMPOSITE FORM" (step 907). In the shown example, since the input of the noun
"en" concerning an object is earlier than that of the noun "kochira" concerning
a position, the coordinate number A indicates an object and the coordinate
number B indicates a moving position. Next, the concept number (193 in Fig. 7)
of the object is matched with the three upper digits of a figure number (303 in
Fig. 3) in the figure depicting table 30 to extract a candidate figure. In the
present embodiment, the candidate figures on the display screen are extracted as
numbers 5011 and 5012 in the figure depicting table 30. Next, the circle having
center coordinates nearest to the coordinate number "A" (XA, YA), which
indicates the position of the object noun obtained from the inputted voice, is
settled as the figure which is the object of indication, and the contour of that
figure is made to flicker. In the case of the shown example, a circle 51A in
Fig. 5 corresponds to the figure number 5011 in Fig. 3 and hence the figure
number 5011 is recognized as being the candidate figure. The recognized figure
number is sequentially stored in the form of obj[1] = 5011 (step 908).
Information having already been stored through pointing and concerning the
object and the copying position is held until the next pointing is inputted into
the graphic editing area.

Next, in the case where the candidates of the verb as the form element
information are ranked by reliability of recognition in the order of "ido"
(move), "fukusha" (copy) and "kokan" (exchange), the command numbers are
inputted as Com[1] = 301, Com[2] = 302, and so on. First, the "ido" (move)
operation of Com[1] = 301, the candidate having the highest or first rank of
recognition, is performed (step 909). The selected circle is moved to the
coordinate number "B" (XB, YB) on the main storage 2 which is the indicated
position. At this time, the result of the above operation and the command name
of the performed operation are displayed on the display screen, as shown in Fig.
8. Further, a menu of candidates of recognition of the inputted voice concerning
the operation command and having the second and lower ranks of reliability is
displayed on the same display screen (step 910). The number of candidates to be
displayed may be limited to a predetermined number, or only candidates having a
predetermined rank and higher ranks than that may be displayed.

In the shown example, the operation command is specified as "ido" (move) on the
basis of the verb recognized from the inputted voice. However, in the case where
the user actually requested not "ido" (move) but "fukusha" (copy), the
processing is erroneous: it reflects the result of recognition of the inputted
voice and not the operation desired by the user. In that case, therefore,
"fukusha" (copy), which is the command intended by the user himself or herself,
is selected by the user on the touch panel 5 from the menu of operation command
candidates with the second and lower ranks of reliability displayed on the
display screen so that the erroneous processing is corrected with a high
efficiency (step 911). When an operation command or menu item intended by the
user is selected from the candidate menu having the second and lower ranks, the
coordinate area of each item of the candidate menu is checked against the
pointing position to select Com[2] = 302. Next, the previously performed "ido"
(move) operation is cancelled, but the commonly available pointing information
used at the time of erroneous processing is held in the main storage 2 as it is,
and the held pointing information can be used again in the processing which is
to be performed after correction of the erroneous processing. Using the held
pointing information, a "fukusha" (copy) operation is performed in accordance
with Com[2] = 302 (step 912). As the result of correction of erroneous
processing, the object is copied to the coordinate number "B" (XB, YB) on the
main storage 2.

On the other hand, in the case where no corresponding candidate of the command
exists in the displayed menu, only the name of a processing command is inputted
by voice again so that the processing command is inputted through a voice
recognition processing by use of candidates of recognition excepting the
previously displayed candidate having the first rank of reliability of
recognition and the previously menu-displayed candidates having the second and
lower ranks of reliability of recognition.
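
By way of illustration only, steps 908 to 912 might be sketched as follows. The command numbers 301 and 302 and the figure numbers follow the example in the text; the class and function names, the assumption that figure numbers have four digits (so that integer division by 10 yields the three upper digits), and the snapshot-based cancellation are assumptions of this sketch.

```python
from dataclasses import dataclass

MOVE, COPY = 301, 302  # "ido" (move) and "fukusha" (copy) in the example


@dataclass(frozen=True)
class Figure:
    number: int  # e.g. 5011; the upper three digits act as the concept number
    cx: float    # center x-coordinate
    cy: float    # center y-coordinate


def resolve_object(figures, concept_number, point):
    """Step 908: among figures of the given concept, take the nearest center."""
    px, py = point
    same_concept = [f for f in figures if f.number // 10 == concept_number]
    return min(same_concept, key=lambda f: (f.cx - px) ** 2 + (f.cy - py) ** 2)


class EditSession:
    def __init__(self, figures):
        self.figures = {f.number: f for f in figures}
        self.held = None      # pointing information kept for reuse
        self.snapshot = None  # state before the last operation, for cancellation

    def execute(self, ranked_commands, obj, target):
        """Steps 909-910: run Com[1] and return the menu of lower ranks."""
        self.held = (obj.number, target)
        self._apply(ranked_commands[0])
        return list(ranked_commands[1:])

    def correct(self, chosen_command):
        """Steps 911-912: cancel the erroneous operation, then redo it with
        the chosen command using the held pointing information."""
        self.figures = self.snapshot
        self._apply(chosen_command)

    def _apply(self, command):
        number, (x, y) = self.held
        self.snapshot = dict(self.figures)
        if command == MOVE:
            self.figures[number] = Figure(number, x, y)
        elif command == COPY:
            prefix = number // 10
            new_number = max(n for n in self.figures if n // 10 == prefix) + 1
            self.figures[new_number] = Figure(new_number, x, y)
        else:
            raise ValueError(f"unsupported command number: {command}")
```

The point of the sketch is that `correct` takes only the command selected from the menu: the object and the target position are not inputted again, because they are already held from the first attempt.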

In a system of the present invention, as mentioned above, in which a processing
is performed using a plurality of inputs in a composite form, the input of only
contents to be subjected to correction suffices at the time of correction of
erroneous processing, thereby making it possible to save the labor of inputting
redundant or duplicative data again. When a voice input is made again, the
candidates already known to be wrong can be eliminated with certainty, thereby
enabling a highly efficient recognition processing.
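
By way of illustration only, the elimination of previously presented candidates at the time of a repeated voice input might be sketched as follows. The function name, the score dictionary and the command words "sakujo" and "kaiten" are hypothetical and serve only as example data.

```python
def rerecognize(scores, already_presented):
    """Rank candidates for a repeated voice input.

    scores:            mapping of command word to reliability of recognition
                       for the new utterance (assumed format).
    already_presented: the first-rank candidate and the menu-displayed
                       candidates from the previous attempt, which the user
                       has implicitly rejected.
    """
    remaining = {w: s for w, s in scores.items() if w not in already_presented}
    # Highest reliability first, as with Com[1], Com[2] and so on.
    return sorted(remaining, key=remaining.get, reverse=True)
```

Because the rejected candidates are removed before ranking, a word that was misrecognized with high reliability the first time cannot take the first rank again.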

The present invention provides the following 
effects. 

When a user inputs information by use of voice input means and another input
means, a system performs a processing or display determined by a candidate
having the first rank of reliability as the result of recognition of voice and
displays on a display screen a menu of plural candidates which have the second
rank of reliability and the lower ranks than that as the result of recognition
of voice. In the case where the processing or display based on the candidate
having the first rank of reliability is erroneous or the user makes an erroneous
input, an error processing is performed in which a correct input candidate is
selected by a finger, pen or the like from the displayed menu of candidates
having the second rank of reliability and the lower ranks than that so that a
processing operation or display associated with the selected candidate is
performed again. As a result, it is possible to perform the error processing
simply.

Also, in order that it is not required that information other than a voice input
and inclusive of pointing information be inputted again at the time of
correction in the case where the candidate of recognition of voice is selected,
error processing means for storing pointing information having already been
inputted is provided. As a result, in the case where the user has inputted
information other than voice as well as a voice input, a need to input the
correctly inputted information again is eliminated, or the input of only the
erroneous information suffices.

Further, in the case where there is no correct candidate in the displayed menu,
an error processing for correcting the erroneous recognition is performed in
such a manner that the candidate having the first rank of reliability of
recognition as the result of recognition of voice and the menu-displayed
candidates having the second rank of reliability of recognition and the lower
ranks than that are eliminated from an object of recognition and only necessary
information is inputted by voice again. As a result, it is possible to improve
the reliability of recognition by thus reducing the number of candidates of
recognition.

Furthermore, in an image editing system or the like in which an editing
operation is performed in such a manner that a user inputs a command such as
"ido" (move), "fukusha" (copy) and so forth by voice and indicates an object, a
moving position and so forth by a finger, pen or the like, information other
than voice inclusive of pointing information having already been inputted is
stored in the case where the input of information is followed by performing an
operation based on a command having the first rank of reliability of recognition
of voice while displaying a menu of candidates having the second rank of
reliability of recognition of voice and the lower ranks than that and the
operation based on the candidate having the first rank of reliability of
recognition of voice is erroneous or the user makes an erroneous input. Thereby,
it is possible to reduce the number of error processing steps in such a manner
that a selection to the menu-displayed plural candidates having the second rank
of reliability of recognition and the lower ranks than that is only made by the
finger, pen or the like to perform a processing operation or display associated
with the selection again.

Also, a function of performing the processing or display determined by the
candidate having the first rank of reliability of recognition and displaying the
content of the result of recognition itself on the display screen or outputting
the content of the result of recognition by voice is provided, thereby making it
possible for the user to confirm the result of recognition.

The present invention has been described with reference to the preferred and
alternate embodiments. Obviously, modifications and alterations will occur to
those skilled in the art upon reading and understanding the present invention.
It is intended that the invention be construed as including all such
modifications and alterations insofar as they come within the scope of the
appended claims or the equivalent thereof.

Claims 

1. An information processing system using voice
comprising at least information displaying
means (4) for displaying information, position
information inputting means (5) for inputting
continuous position information by a user
through a pointing gesture or the like, voice
information inputting means (7, 8) for inputting
voice information, storing means (2) for storing
the position information inputted by said position
information inputting means and the voice
information inputted by said voice information
inputting means, standard pattern information
storing means (16, 17) for storing at least one
of voice standard pattern, word information and
grammar information, and voice information
analyzing means (133) for determining the
reliability of voice inputted by said voice information
inputting means by use of the at least one
of voice standard pattern, word information and
grammar information, the system being provided
with error processing means (3, 6, 13
(131-134), 14) by which in the case where a
processing or display determined by a candidate
having the first rank of reliability of
recognition as the result of recognition of voice
is first performed while a menu of plural candidates
having the second rank of reliability of
recognition and the lower ranks than that as
the result of recognition of voice is displayed
on a display screen (21) and the processing or
display determined by the candidate having
the first rank of reliability is erroneous or the
user makes an erroneous input, a correct input
candidate is selected by a finger, pen or the
like from the displayed menu of candidates
having the second rank of reliability of recognition
and the lower ranks than that so that a
processing operation or display associated
with the selected candidate is performed again.
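
As an illustration only, the split between the first-rank candidate and the menu of lower-rank candidates in Claim 1 might be sketched as follows. The function name `split_candidates`, the `menu_size` parameter and the reliability scores are hypothetical and are not taken from the specification.

```python
from typing import List, Tuple


def split_candidates(scored: List[Tuple[str, float]], menu_size: int = 3):
    """Return (first_rank, menu) from (word, reliability) pairs.

    The first-rank candidate is the one acted on immediately; the menu
    holds the second and lower ranks, in descending order of reliability.
    menu_size is an assumed limit, not a value from the specification.
    """
    if not scored:
        raise ValueError("no recognition candidates")
    ranked = sorted(scored, key=lambda c: c[1], reverse=True)
    first = ranked[0][0]
    menu = [word for word, _ in ranked[1:1 + menu_size]]
    return first, menu


# Hypothetical recognition result for one utterance.
first, menu = split_candidates([("copy", 0.61), ("move", 0.74), ("delete", 0.20)])
print(first, menu)
```

Here "move" would be executed at once, while "copy" and "delete" would be offered in the menu for correction.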

2. An information processing system using voice
according to Claim 1, wherein said error processing
means stores pointing information having
already been inputted in order that it is not
required that information other than a voice
input and inclusive of pointing information be
inputted again at the time of correction in the
case where the candidate of recognition of
voice is selected.

3. An information processing system using voice
according to Claim 1, wherein in the case
where there is no correct candidate in the
displayed menu, said error processing means
makes it possible to make correction by inputting
only necessary information by voice again.

4. An information processing system using voice
according to Claim 1, wherein in the case
where a voice input is made again, the candidate
having the first rank of reliability of
recognition as the result of recognition of voice
and the menu-displayed candidates having the
second rank of reliability of recognition and the
lower ranks than that are eliminated from an
object of recognition.
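
A minimal sketch of the exclusion described in Claim 4, assuming the recognizer returns (word, reliability) pairs: candidates that were already executed or shown in the menu are filtered out before the re-input is ranked. The function name `rerecognize` and all scores are hypothetical.

```python
from typing import Iterable, List, Tuple


def rerecognize(scored: List[Tuple[str, float]],
                already_offered: Iterable[str]) -> List[Tuple[str, float]]:
    """Rank candidates for a repeated voice input.

    Words already tried (the first-rank candidate and the menu entries)
    are removed from the objects of recognition, since the user has
    implicitly rejected them.
    """
    rejected = set(already_offered)
    remaining = [(w, s) for w, s in scored if w not in rejected]
    return sorted(remaining, key=lambda c: c[1], reverse=True)


offered = ["move", "copy", "delete"]  # first rank plus menu from the first pass
second_pass = rerecognize(
    [("move", 0.70), ("rotate", 0.55), ("copy", 0.50), ("enlarge", 0.30)],
    offered,
)
print(second_pass)
```

Filtering after scoring is one possible design; a recognizer could equally remove the rejected words from its vocabulary before matching.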

5. An information processing system using voice 
according to Claim 1, wherein the system has 
a function of performing the processing or dis- 
play determined by the candidate having the 
first rank of reliability of recognition and dis- 
playing the content of the result of recognition 
itself on the display screen or outputting the 
content of the result of recognition by voice. 

6. An information processing system using voice
in an image editing system or the like which 
includes the information processing system us- 
ing voice according to Claim 1 and in which an 
editing operation is performed by inputting a 
command such as "move", "copy" and so 
forth through voice by a user and indicating an 
object, a moving position and so forth by a 
finger, pen or the like, the system being pro- 
vided with error processing means by which in 
the case where the input of information is 
followed by performing an operation based on 
a command which is a candidate having the 
first rank of reliability of recognition of voice 
while displaying a menu of candidates having 
the second rank of reliability of recognition of 
voice and the lower ranks than that and the 
operation based on the candidate having the 
first rank of reliability of recognition of voice is 
erroneous or the user makes an erroneous 
input, information other than voice and inclu- 
sive of pointing information having already 
been inputted is stored so that a selection to 
the menu-displayed plural candidates having 
the second rank of reliability of recognition and 
the lower ranks than that is only made by the 
finger, pen or the like to perform a processing 
operation or display associated with the selec- 
tion again. 

7. An erroneous input correcting method in a 
composite input information processing sys- 
tem, comprising the steps of: 

inputting an editing instruction indicative of
the modification in shape or change in position
of a displayed object by voice while designating
the displayed object directly;

storing position information of the object;

recognizing the inputted voice and performing
an editing instruction having the first
rank of reliability of recognition;

displaying the result of performance and a
menu of editing instructions having the second
rank of reliability of recognition as the result of
recognition of voice and the lower ranks than
that;

selecting a processing instruction of the
menu of editing instructions having the second
rank of reliability of recognition and the lower
ranks than that; and

executing said processing instruction for
said position information.
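
The steps of Claim 7 can be sketched roughly as below. This is an illustrative model, not the implementation of the specification: the `Editor` class, its method names and the three commands are hypothetical, and undoing the first-rank result by restoring a snapshot before re-execution is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[int, int]


@dataclass
class Editor:
    objects: Dict[str, Point]            # displayed objects and their positions
    stored_target: str = ""              # object designated by pointing
    stored_dest: Point = (0, 0)          # destination designated by pointing
    menu: List[str] = field(default_factory=list)
    _snapshot: Dict[str, Point] = field(default_factory=dict)

    def _apply(self, command: str) -> None:
        if command == "move":
            self.objects[self.stored_target] = self.stored_dest
        elif command == "copy":
            self.objects[self.stored_target + "_copy"] = self.stored_dest
        elif command == "delete":
            del self.objects[self.stored_target]
        else:
            raise ValueError(f"unknown command: {command}")

    def voice_input(self, target: str, dest: Point,
                    scored: List[Tuple[str, float]]) -> None:
        # Store the position information so it need not be inputted again.
        self.stored_target, self.stored_dest = target, dest
        ranked = sorted(scored, key=lambda c: c[1], reverse=True)
        self.menu = [w for w, _ in ranked[1:]]   # second and lower ranks
        self._snapshot = dict(self.objects)      # assumed undo mechanism
        self._apply(ranked[0][0])                # perform first-rank instruction

    def select_from_menu(self, command: str) -> None:
        if command not in self.menu:
            raise ValueError("command is not in the displayed menu")
        self.objects = dict(self._snapshot)      # discard the erroneous result
        self._apply(command)                     # reuse stored position info


editor = Editor(objects={"circle": (10, 10)})
# The user said "move" but the recognizer ranked "copy" first.
editor.voice_input("circle", (50, 40), [("copy", 0.6), ("move", 0.5)])
after_first = dict(editor.objects)
editor.select_from_menu("move")   # one menu pick, no new pointing
print(after_first, editor.objects)
```

The correction requires only the menu selection; the target object and destination come from the stored position information.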